Columbia University

Course: STATW4701 EXPLORATORY DATA ANALYS/VISUA (Spring 2016)
Professor: Dr. Mengqian Lu
Team members: Aditi Hota, Christopher Lam, Robert Carl Minnich, Xiaohui (Eartha) Guo, Xiaowei Zhang, Yijing (Phoebe) Sun

Background

Our story began with an article published in The Washington Post (How financial need sways graduation rates, 2007). This article compares the graduation rates between Pell-eligible students (students who are qualified for federal Pell grants, ie. whose families make less than $50,000 per year) and non-Pell students to evaluate the effectiveness of federal financial aid on student success beyond mere college access. After analysis of data from 1150 schools in the US, it was found that Pell-eligible students in general have lower graduation rates (51%) than non-Pell students (65%). Although results are greatly varied by individual schools, it revealed a fact that some schools didn’t do enough in helping their Pell-eligible students get through college. Our goal here is to futher explore the relationship between federal students aid and graduation rate (by school types) as well as some other factors, for example, default rate on loans, and hopefully this could help people with their college selecting decisions.

Our idea can also be summarized by the following figure:

Final_Diagram

Google Search Trend for “FAFSA”

To start, we wanted to know about our audience - students interested in federal financial aid. We use an aggregated Google search queries dataset (2005 to 2010), which includes the search queries for “FAFSA” -Free application for Federal Student Aid across the country within the five years. We chose data around " Dec, Jan, Feb, March, Apr" as our focus of study, because those are the time periods for students to apply for Financial aid. Since FAFSA is required for every student when applying for federal student aid, we assume it could give us some insights about the time and location factors trending in people’s interests in student aid. The colors indicate different search frequencies within each state given a certain time period. (From Red to Yellow represent “searched most” to “searched least”)

Search

Based on the graphic above, we can see that:

Google Search Trend for “Pell Grants”

We also did a similar graph for the search of “Pell Grants”:

PellGrants

The results are pretty consistent with the previous one, which raises another interesting point: people who search for “FAFSA” are also very likely to search for “Pell Grants”.

We then moved to our first question: how federal financial aid plays into the choice of school.

Aid Disbursements by school:

To begin, we divided schools into three different types in order to compare the Average Aid Disbursements within each type. And here, we choose three Financial Aid packages to study.

Type Example
Public School Ohio State University
Private/Non-Profit School Columbia Unviversity
Proprietary School Kaplan Unviversity



Aid Type Description
FSEOG Federal Supplemental Educational Opportunity Grant: (no repaid)
Perkins Fedueral Perkins Loan: (low interest loan)
FWS Federal Work-Study Program: Provides funds for part time employment


The following chord diagram can help understand how much money is available per student for different types of federal aid. This is split up by institution type.

From this chord diagram, we have some interesting findings:

We then sought to move to the next level: Investigating the success rates of students at different schools receiving various forms of federal aid.

Challenges in finding data

We wanted initially to relate the financial aid types to graduation rate at each school, but we ran into some problems finding the data, since until last year it was not required to be reported:

hechiger
hechiger_blurb

Source: http://hechingerreport.org/an-unprecedented-look-at-pell-grant-graduation-rates-from-1149-schools/

We can see from the data that there are still many schools not reporting their proper statistics and this is most likely because they are trying to hide information that could hurt their enrollment. We made a chord diagram to actually show the information incompleteness in our dataset. In this Chord Diagram, we basically took out all the schools with unreported information in order to see which types of schools are more likely have missing values. For example: Over all schools have no value in the “Completion Rates” field, most of them are proprietary institutions, and only few belong to private schools. Proprietary schools are also more likely to have unreported information on “Default rate on loans” field than others. The other two fields “Debt- Did Finish” and “Debt- Did Not Finish” don’t have a clear preference - all three types of schools are equally likely to generate missing values in these two fields, suggesting schools may be afraid to reveal the persistence of debt even amongst those who obtain a degree and are thus more likely to get a job. The description of the four fields we focus on are as follows:

Although, the dataset is not complete, we still want to have a view of the information we have:

Key Attibutes when choosing schools:

In order to do that, we take out all schools with complete information in the dataset and analyze their performance in the following grouped bar plot:

Interactive Map

Finally, we created an interactive map to better assist our users with their choice of schools based on the location and some other key attributes. How to use this map:

Once you click on individual schools, it will give you the information about school’s name, type, Pell Average Disbursement and so on. Take Columbia University for example:

columbia

We can see that: The average Pell disbursment in Columbia is about $2549.31, and our graduation rate is about 93% with only 1.9% loan default rate on average.

Now Enjoy the Map!